Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval

Authors

Sebastian Schuster

Ranjay Krishna

Angel X. Chang

Li Fei-Fei

Christopher D. Manning

Abstract

Semantically complex queries that include attributes of objects and relations between objects still pose a major challenge to image retrieval systems. Recent work in computer vision has shown that a graph-based semantic representation called a scene graph is an effective representation for very detailed image descriptions and for complex queries for retrieval. In this paper, we show that scene graphs can be effectively created automatically from a natural language scene description. We present a rule-based and a classifier-based scene graph parser whose output can be used for image retrieval. We show that including relations and attributes in the query graph outperforms a model that only considers objects and that using the output of our parsers is almost as effective as using human-constructed scene graphs (Recall@10 of 27.1% vs. 33.4%). Additionally, we demonstrate the general usefulness of parsing to scene graphs by showing that the output can also be used to generate 3D scenes.
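As a rough illustration of the representation the abstract describes (objects, their attributes, and relations between objects), the following is a minimal sketch of a scene graph data structure. It is not the authors' parser or data format: the class names, method names, and the example sentence are all assumptions made for illustration, and the graph is built by hand rather than parsed from text. It also assumes each object name appears only once, which real scene graphs do not require.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A directed, labeled edge between two objects (hypothetical type)."""
    subject: str    # name of the subject object
    predicate: str  # relation label, e.g. "sit on"
    obj: str        # name of the object the relation points to


@dataclass
class SceneGraph:
    """Toy scene graph: objects, (object, attribute) pairs, and relations."""
    objects: set = field(default_factory=set)
    attributes: set = field(default_factory=set)
    relations: set = field(default_factory=set)

    def add_object(self, name, *attrs):
        # Register the object and attach any attributes to it.
        self.objects.add(name)
        for attr in attrs:
            self.attributes.add((name, attr))

    def add_relation(self, subject, predicate, obj):
        # Make sure both endpoints exist, then record the edge.
        self.objects.update((subject, obj))
        self.relations.add(Relation(subject, predicate, obj))


# Hand-built graph for the made-up description
# "A brown dog sits on a wooden bench."
graph = SceneGraph()
graph.add_object("dog", "brown")
graph.add_object("bench", "wooden")
graph.add_relation("dog", "sit on", "bench")
```

An object-only query in this sketch would use just `graph.objects`, while the richer queries the abstract compares against would also use `graph.attributes` and `graph.relations`.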


Similar resources

Scene Graph Parsing as Dependency Parsing

In this paper, we study the problem of parsing structured knowledge graphs from textual descriptions. In particular, we consider the scene graph representation (Johnson et al., 2015) that considers objects together with their attributes and relations: this representation has been proved useful across a variety of vision and language applications. We begin by introducing an alternative but equiv...


Ontology-Based Image Retrieval

The binary form of an image does not tell what the image is about. It is possible to retrieve images from a database using pattern matching techniques, but usually textual descriptions attached to the images are used. Semantic web ontology and metadata languages provide a new way to annotating and retrieving images. This paper considers the situation when a user is faced with an image repositor...


Evaluating a text-to-scene generation system as an aid to literacy

We discuss classroom experiments using WordsEye, a system for automatically generating 3D scenes from English textual descriptions. Input is syntactically and semantically processed to identify a set of graphical objects and constraints which are then rendered as a 3D scene. We describe experiments with the system in a summer literacy enrichment program conducted at the Harlem Educational Activ...


Graph Grammar Based Object Recognition for Image Retrieval

In order to retrieve a set of intended images from an image archive, human beings think of special contents with respect to the searched scene. The necessity of a semantics-based retrieval leads to a content-based analysis and retrieval of images. From this point of view, our project Image Retrieval for Information Systems (IRIS) develops and combines methods and techniques of computer vision...


Image Generation from Scene Graphs

To truly understand the visual world our models should be able not only to recognize images but also generate them. To this end, there has been exciting recent progress on generating images from natural language descriptions. These methods give stunning results on limited domains such as descriptions of birds or flowers, but struggle to faithfully reproduce complex sentences with many objects a...




Publication date: 2015
